Boosted trees for ecological modeling and prediction.
Abstract
Accurate prediction and explanation are fundamental objectives of statistical analysis, yet they seldom coincide. Boosted trees are a statistical learning method that attains both of these objectives for regression and classification analyses. They can deal with many types of response variables (numeric, categorical, and censored), loss functions (Gaussian, binomial, Poisson, and robust), and predictors (numeric, categorical). Interactions between predictors can also be quantified and visualized. The theory underpinning boosted trees is presented, together with interpretive techniques. A new form of boosted trees, namely, "aggregated boosted trees" (ABT), is proposed and, in a simulation study, is shown to reduce prediction error relative to boosted trees. A regression data set is analyzed using ABT to illustrate the technique and to compare it with other methods, including boosted trees, bagged trees, random forests, and generalized additive models. A software package for ABT analysis using the R software environment is included in the Appendices together with worked examples.
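As a rough illustration of the boosting idea summarized in the abstract, the following Python sketch fits a sequence of one-split regression trees (stumps) to the residuals of a squared-error (Gaussian) loss. It is a generic gradient-boosting toy, not the ABT procedure or the R package described in the paper. The function names (`fit_stump`, `boost`, `predict`, `mse`), the single numeric predictor, and the default learning rate are all assumptions made for this example.

```python
from typing import List, Tuple

# A stump is (threshold, value_if_x_le_threshold, value_if_x_gt_threshold).
Stump = Tuple[float, float, float]
# A model is (initial constant, learning rate, list of stumps).
Model = Tuple[float, float, List[Stump]]


def fit_stump(x: List[float], r: List[float]) -> Stump:
    """Least-squares fit of a one-split tree to residuals r (needs >= 2 distinct x)."""
    best_sse, best_stump = None, None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2.0  # candidate split halfway between neighbours
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        sse = sum((ri - lv) ** 2 for ri in left) + sum((ri - rv) ** 2 for ri in right)
        if best_sse is None or sse < best_sse:
            best_sse, best_stump = sse, (t, lv, rv)
    return best_stump


def predict_stump(stump: Stump, xi: float) -> float:
    t, lv, rv = stump
    return lv if xi <= t else rv


def boost(x: List[float], y: List[float],
          n_trees: int = 100, learning_rate: float = 0.1) -> Model:
    """Stagewise boosting for squared-error loss: each stump fits current residuals."""
    f0 = sum(y) / len(y)  # start from the mean response
    pred = [f0] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, resid)
        stumps.append(stump)
        # Shrink each tree's contribution by the learning rate.
        pred = [pi + learning_rate * predict_stump(stump, xi)
                for pi, xi in zip(pred, x)]
    return f0, learning_rate, stumps


def predict(model: Model, xi: float) -> float:
    f0, learning_rate, stumps = model
    return f0 + learning_rate * sum(predict_stump(s, xi) for s in stumps)


def mse(model: Model, x: List[float], y: List[float]) -> float:
    return sum((yi - predict(model, xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
```

Calling `boost(x, y)` on a small numeric data set and comparing `mse` against a model built with `n_trees=0` (the mean-only baseline) shows how successive shrunken trees reduce training error. In practice the number of trees would be chosen on held-out data rather than training error.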
Similar resources
Modeling the Prevalence of Avian Influenza in Guilan Province Using Data Mining Models and Spatial Information System in 2016: An Ecological Study
Background and Objectives: Infection of birds with Highly Pathogenic Avian Influenza (HPAI) and their extinction impose heavy losses on the livestock and poultry industry, along with public health. Nowadays, due to the volume and variety of data, the use of location-based technologies and data mining has become inevitable. This study aims to model the prevalence of avian influenz...
Regional data refine local predictions: modeling the distribution of plant species abundance on a portion of the central plains.
Species distribution models are frequently used to predict species occurrences in novel conditions, yet few studies have examined the consequences of extrapolating locally collected data to regional landscapes. Similarly, the process of using regional data to inform local prediction for species distribution models has not been adequately evaluated. Using boosted regression trees, we examined er...
Incorporating Boosted Regression Trees into Ecological Latent Variable Models
Important ecological phenomena are often observed indirectly. Consequently, probabilistic latent variable models provide an important tool, because they can include explicit models of the ecological phenomenon of interest and the process by which it is observed. However, existing latent variable methods rely on hand-formulated parametric models, which are expensive to design and require extensiv...
Comparing Different Modeling Techniques for Predicting Presence-absence of Some Dominant Plant Species in Mountain Rangelands, Mazandaran Province
In applied studies, the investigation of the relationship between a plant species and environmental variables is essential to manage ecological problems and rangeland ecosystems. This research was conducted in summer 2016. The aim of this study was to compare the predictive power of a number of Species Distribution Models (SDMs) and to evaluate the importance of a range of environmental variabl...
Credit scoring with boosted decision trees
The enormous growth experienced by the credit industry has led researchers to develop sophisticated credit scoring models that help lenders decide whether to grant or reject credit to applicants. This paper proposes a credit scoring model based on boosted decision trees, a powerful learning technique that aggregates several decision trees to form a classifier given by a weighted majority vote o...
Journal title:
- Ecology
Volume 88, Issue 1
Publication date: 2007